Adding Column Operations to your Data Flow
The Column Operations nodes are used to edit, manipulate, or calculate values in the table columns. A range of functions allow you to perform basic edits like renaming columns, converting data types, and reordering columns in the table. You can also create calculation columns based on your own PQL expression, generate new date columns by adding or subtracting date parts from an existing date-time column, and more.
Calculated Columns vs Function Presets
The Calculated Column node is a free-form version of the other functions; the remaining functions offer shortcuts to standard column operations, while the calculated column is the more advanced option, producing values based on user-defined PQL expressions.
Configuring Column Operations
The Column Operation nodes can be connected to Select, SQL Query, Bottom N, and Top N nodes. Each required operation node should be connected to the node representing the table containing the columns to be manipulated.
Once connected to the Data Flow, the Column Operations node must be configured from its Properties panel.
Column Operation Nodes
The following Column Operation nodes can be connected to the Data Flow:
Column Manipulation
- Calculated Columns: Add a new column of values using free-form functions and capabilities based on given PQL expressions.
- Combine Columns: Combine multiple columns into one.
- Convert Columns: Perform data type conversion on columns.
- JSON Extract: Extract data in a JSON format.
- Rename Columns: Change column names.
- Reorder Columns: Edit the order of columns in the table.
- Split Columns: Split a column into multiple columns.
String Manipulation
- Replace: Replace a character, string, or substring.
- String Left: Include only a substring from the left.
- String Right: Include only a substring from the right.
- Substring: Include only a substring from a string column.
- Trim: Remove a blank space from the start or end of the string.
Date Manipulation
- Add Date Unit: add or subtract a date part from a date-time column.
- Date Difference: calculate the difference between two dates.
Math Functions
- Aggregate FX: perform aggregate functions on selected columns.
- Math FX: perform mathematical functions on a column.
- Trig FX: perform trigonometric functions on a column.
Common Properties
The following fields are present in the Properties panel when you have any of the preceding nodes selected on the canvas. There are also Properties that are specific to the nodes described above, see the linked pages for more information.
Result Properties
The Result Properties panel contains only one field: Resulting Table Name. This is the name for the resulting table. You can change this value, if required, or leave its default name.
Column Selection
Expand the Column Selection panel to update the column selection for the given table. By default, all columns in the table are selected. You can, however, remove columns by clearing their checkboxes. Columns that are not selected will not be copied to the new data model.
The Data Type icon (blue arrow above) indicates the data type for this column.
Tip: Hover your cursor over the Data Type icon to view the current name of the column as a tooltip. This is useful where the name is too long to fit in the available space.
Searching your columns
If you are interested in particular columns, click Search (yellow arrow) to open a search field.
Renaming your Columns
To rename your column in this view, either double-click the current name or click Edit (purple arrow). The updated name will be used downstream of this node in your Data Flow.
Note: You can see that the name of this column has been updated in the example above, because the Info icon (green arrow) is visible. Hover your cursor over this icon to see the column's original name as a tooltip. Click Reset (orange arrow) to return the column to its original name.
Set Variable Values
Expand the Set Variable Values window to select variables that you want to pass to the node. Click the Plus (+) sign and then select the relevant variable, the aggregation type (update with), and the relevant column. For information about creating and editing variables, see Variables Panel.
Metadata
Expand the Metadata panel and add the following metadata details.
Description
Add a description for this node. This is useful for keeping track of the Data Flow (ETL) process, especially if multiple users are working with the same Data Flow. The description is visible only in the Model app.
Validate
If you do not want to validate this node when you run the validation process, for example because it is under construction and temporarily contains some invalid scripting, you can clear the selection of the Validate checkbox. Recommended: Always leave the Validate checkbox selected.
Node ID
The unique ID for this node. Click the Copy icon to copy this ID to your clipboard.